System and method for eye tracking during authentication

ABSTRACT

A system, apparatus, method, and machine readable medium are described for performing eye tracking during authentication. For example, one embodiment of a method comprises: receiving a request to authenticate a user; presenting one or more screen layouts to the user; capturing a sequence of images which include the user's eyes as the one or more screen layouts are displayed; and (a) performing eye movement detection across the sequence of images to identify a correlation between motion of the user's eyes as the one or more screen layouts are presented and an expected motion of the user's eyes as the one or more screen layouts are presented and/or (b) measuring the eye's pupil size to identify a correlation between the effective light intensity of the screen and its effect on the user's eye pupil size.

BACKGROUND

Field of the Invention

This invention relates generally to the field of data processing systems. More particularly, the invention relates to a system and method for performing eye tracking techniques to improve authentication.

Description of Related Art

Systems have been designed for providing secure user authentication over a network using biometric sensors. In such systems, the score generated by the application, and/or other authentication data, may be sent over a network to authenticate the user with a remote server. For example, Patent Application No. 2011/0082801 (“801 application”) describes a framework for user registration and authentication on a network which provides strong authentication (e.g., protection against identity theft and phishing), secure transactions (e.g., protection against “malware in the browser” and “man in the middle” attacks for transactions), and enrollment/management of client authentication tokens (e.g., fingerprint readers, facial recognition devices, smartcards, trusted platform modules, etc).

In general, authentication techniques are robust against spoofing if (a) secret information is used for authentication or (b) it is hard to produce a fake input. Most systems today rely on password-based authentication. Passwords are easy to reproduce, so they need to be kept secure. Consequently, password attacks typically focus on gaining access to a user's password. Recent attacks have demonstrated the vulnerability of servers on which the passwords are stored for verification.

In contrast to password-based authentication, when using biometrics for authentication, the biometric information typically is public. For example, a fingerprint can be retrieved from (almost) any object touched by the user. Similarly, a user's face is typically not hidden and hence can be seen and captured by anyone and is often published on social networks.

In the real world, we can rely on our own recognition abilities when we see a person, because it is hard to “produce” another person having the same biometric characteristics. For example, it is still hard to “produce” another person having the same face and mannerisms. This is why governments include pictures of the face in passports, ID cards, driver's licenses and other documents. In the virtual world, however, we don't have to “produce” another person with the same face in order to spoof the system, but only something that the computer would recognize, such as a picture of the face. In other words, “[t]he moral is that biometrics work well only if the verifier can verify two things: one, that the biometric came from the person at the time of verification, and two, that the biometric matches the master biometric on file” (see Reference 1 from the list of references provided prior to the claims of the present specification).

In the past, research on automatic face recognition has focused on reliable recognition of faces using still images and video. See, e.g., Reference 2 below. Several relatively robust face recognition techniques exist and systems are commercially available today (see Reference 3). However, little attention has been paid to “liveness” detection, i.e., “verification . . . that the biometric came from the person at the time of verification.” In several use cases, spoofing protection is either not required or it is still being performed by humans (e.g., for law enforcement applications).

The ubiquity of cameras in computing devices such as notebooks and smartphones on one hand, and the weakness of passwords as the most prevalent authentication method on the other hand, drive the adoption of biometric authentication methods in general, and face recognition in particular. The first large scale “trial” of face recognition as an authentication method was done in Google Android 4 (aka, “Ice Cream Sandwich”) and was based on still image recognition. These techniques can be fooled easily with photographs (see Reference 4). Even improved methods which include some sort of liveness detection in Android 4.1 (aka, “Jelly Bean”) can easily be spoofed by presenting two photos in a sequence, one with open eyes and an electronically modified one with closed eyes on a computer display to the camera (see Reference 5).

Though it can be argued that this weakness is due to resource limitations on mobile devices, it also appears that commercial software available for PCs and even the research of anti-spoofing detection is not yet very mature. The assignee of the present application performed tests with PC-based face recognition software which confirm this finding:

Cogent BioTrust 3.00.4063, operated on a Windows 7® based Samsung Series 5® Notebook, performs no liveness check at all, even with security settings set to “high.” A simple face image, displayed on a normal computer monitor, was sufficient to successfully spoof the system.

KeyLemon 2.6.5, operated on a Macbook Air®, performs simple blink tests as a liveness check. It can be successfully spoofed by displaying a sequence of 3 images: (1) a real image of the face (e.g., created by a web cam); (2) a modification of the real image, where the eyes have been re-colored to look as if they are closed; (3) the real image again.

Anti-spoofing detection is not part of standard tests such as the NIST biometric vendor tests when comparing different algorithms. See, e.g., References 6-8. One of the first known public competitions, organized by several researchers in 2011 (see Reference 9), showed early success of some algorithms, but it was based on videos with a resolution of 320×240 pixels. Typical computing devices provide front-facing camera resolutions of at least 640×480 pixels.

FIG. 1 illustrates an exemplary client 120 with a biometric device 100 for performing facial recognition. When operated normally, a biometric sensor 102 (e.g., a camera) reads raw biometric data from the user (e.g., snaps a photo of the user) and a feature extraction module 103 extracts specified characteristics of the raw biometric data (e.g., focusing on certain facial features, etc). A matcher module 104 compares the extracted features with biometric template data 110 stored in a secure storage on the client 120 and generates a score and/or a yes/no response based on the similarity between the extracted features and the biometric template data 110. The biometric template data 110 is typically the result of an enrollment process in which the user enrolls a facial image or other biometric data with the device 100. An application 105 may then use the score or yes/no result to determine whether the authentication was successful.
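By way of illustration only, the matcher step described above might be sketched as follows. The use of cosine similarity, the function names, and the 0.9 threshold are assumptions made for this example; the specification does not prescribe a particular similarity measure.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def match(features, template, threshold=0.9):
    """Return (score, accepted) for extracted features vs. an enrolled template.

    The threshold is a hypothetical value, not one taken from the specification.
    """
    score = cosine_similarity(features, template)
    return score, score >= threshold
```

An application could consume either the numeric score or the yes/no result, as the paragraph above describes.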

There are multiple potential points of attack in order to spoof a facial recognition system (see References 10, 11), identified in FIG. 1 as (1)-(8). There are well known protection mechanisms for ensuring the integrity of the biometric templates (6) (e.g., by using electronic signatures) and protecting the integrity of feature extraction (3), feature vector (4), the matcher (5) and its final result (8) (e.g., by applying a combination of (a) white box encryption methods, (b) code obfuscation and (c) device binding).

Protection mechanisms against replaying old captured data to the feature extraction unit (2) are (at least theoretically) covered by the approach of the Trusted Computing Group and by potential extensions to ARM TrustZone. Basically, the approach is to add cryptographic protection mechanisms (e.g., HMAC or electronic signatures) to the sensor and encapsulate the sensor in a tamper proof way, similar to the protection mechanisms used in current smart card chips. The feature extraction engine could then verify the integrity of the incoming data.
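A minimal sketch of such sensor-side protection, using an HMAC, is given below. The monotonically increasing frame counter used to reject replayed data is an assumption of this example, as are the function names; key provisioning inside the tamper-proof sensor is outside the scope of the sketch.

```python
import hashlib
import hmac


def sign_frame(sensor_key: bytes, frame: bytes, counter: int) -> bytes:
    """Sensor side: authenticate a raw frame together with its counter."""
    message = counter.to_bytes(8, "big") + frame
    return hmac.new(sensor_key, message, hashlib.sha256).digest()


def verify_frame(sensor_key: bytes, frame: bytes, counter: int,
                 tag: bytes, last_counter: int) -> bool:
    """Feature-extraction side: reject tampered frames and replayed counters."""
    if counter <= last_counter:
        return False  # old captured data is being replayed
    expected = sign_frame(sensor_key, frame, counter)
    return hmac.compare_digest(expected, tag)
```

The constant-time comparison via `hmac.compare_digest` avoids leaking how many bytes of a forged tag were correct.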

BRIEF DESCRIPTION OF THE DRAWINGS

A better understanding of the present invention can be obtained from the following detailed description in conjunction with the following drawings, in which:

FIG. 1 illustrates an exemplary client equipped with a biometric device;

FIG. 2 illustrates one embodiment of an authentication engine including an eye tracking module and a facial recognition module;

FIG. 3 illustrates an exemplary heatmap for a Web page employed in one embodiment of the invention;

FIGS. 4A-B illustrate exemplary text, graphics, photos, videos, blank regions and other content which may be displayed to an end user;

FIG. 5 illustrates one embodiment of a method for performing eye-tracking and facial recognition-based authentication;

FIGS. 6A-B illustrate different architectural arrangements within which embodiments of the invention may be implemented.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

Described below are embodiments of an apparatus, method, and machine-readable medium for performing eye-tracking techniques during authentication. Throughout the description, for the purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced without some of these specific details. In other instances, well-known structures and devices are not shown or are shown in a block diagram form to avoid obscuring the underlying principles of the present invention.

The embodiments of the invention discussed below involve client devices with authentication capabilities such as biometric devices or PIN entry. These devices are sometimes referred to herein as “tokens,” “authentication devices,” or “authenticators.” While certain embodiments focus on facial recognition hardware/software (e.g., a camera and associated software for recognizing a user's face and tracking a user's eye movement), some embodiments may utilize additional biometric devices including, for example, fingerprint sensors, speaker recognition hardware/software (e.g., a microphone and associated software for recognizing a speaker), and optical recognition capabilities (e.g., an optical scanner and associated software for scanning the retina of a user). The authentication capabilities may also include non-biometric devices such as trusted platform modules (TPMs) and smartcards or secure elements.

As mentioned above, in a mobile biometric implementation, the biometric device may be remote from the relying party. As used herein, the term “remote” means that the biometric sensor is not part of the security boundary of the computer it is communicatively coupled to (e.g., it is not embedded into the same physical enclosure as the relying party computer). By way of example, the biometric device may be coupled to the relying party via a network (e.g., the Internet, a wireless network link, etc) or via a peripheral input such as a USB port. Under these conditions, there may be no way for the relying party to know if the device is one which is authorized by the relying party (e.g., one which provides an acceptable level of authentication and integrity protection) and/or whether a hacker has compromised the biometric device. Confidence in the biometric device depends on the particular implementation of the device.

One embodiment of the invention uses “normal” authentication techniques (e.g., capturing a sequence of images, swiping a finger, entering a code, etc) in order to train the authentication system to recognize non-intrusive authentication situations. In addition, one embodiment returns the authentication state of the device to the relying party rather than sensitive information such as a Machine ID when authentication is required.

Techniques for Protecting Against Fake Biometrics

While the embodiments of the invention described below utilize eye tracking techniques to confirm the “liveness” of the user, in one embodiment, these techniques are combined with one or more existing techniques for detecting fake biometrics (see Reference 1). This is an area of ongoing research. Existing research has identified four different classes of protection approaches for fake biometrics (see Reference 12):

1. Data-driven characterization

-   a. Still Images
    -   i. Detecting the resolution degradation caused by re-scanning images, by analyzing the 2D Fourier spectrum (Reference 13).
    -   ii. Exploiting different reflection characteristics of real faces versus image prints. The theory of this is based on the Lambertian reflectance properties (Reference 14).
    -   iii. Exploiting the different micro textures of real faces and image prints, which arise from printing defects (Reference 15).
    -   iv. Exploiting quality degradation and noise addition on printed images combined with other methods (Reference 16).
-   b. Videos
    -   v. Each camera sensor has its own characteristics and re-capturing a video displayed on a monitor causes artifacts. This can be used to detect spoofing (Reference 12).
    -   vi. In the case of spoofing with images, there is a face-background dependency (Reference 17).
    -   vii. In the case of spoofing attacks, faces typically show more rigid motion (Reference 18).
-   c. Combinations of Still Images and Videos (Reference 12).

2. User behavior modeling (Reference 12).

3. User interaction need (Reference 12).

4. Additional devices (Reference 12).

The most effective non-intrusive mechanisms based solely on existing sensor technology seem to be based on a combination of Motion, Texture, and Liveness detection. See Reference 9.

Textural Differences

The impact of printing and re-scanning a picture may be detected. It is intuitively clear that the quality of an image doesn't improve by printing and re-scanning it. The research in Reference 15 shows that differences can be algorithmically detected by analyzing micro textures: “A close look at the differences between real faces and face prints reveals that human faces and prints reflect light in different ways because a human face is a complex non rigid 3D object whereas a photograph can be seen as a planar rigid object.”

This algorithm has been tested against the images included in the NUAA Photograph Imposter Database. The reported performance is 16.5 ms on average to process an image on a 2.4 GHz Intel Core 2 Duo CPU with 3 GB of RAM using un-optimized C++ code.

Infrared Instead of Visual Light

It is difficult to display images or videos in the infrared spectrum. As a result, liveness detection based on capturing thermal patterns of faces as proposed in Reference 19 would be more robust than capturing patterns in visual light. Unfortunately, infrared sensors are expensive and not included in typical notebooks, tablets or smart phones.

Optical Flow Based Methods

Real faces are three-dimensional objects. Faces are typically moving in normal conversations. The 2D motion of the central face parts, i.e., the parts with less distance to the camera, is expected to be higher compared to the 2D motion of face regions with greater distance from the camera (References 20, 21, 22). For this type of detection a sequence of at least 3 consecutive images is required.
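The comparison of central and outer face motion may be sketched as below. The sketch assumes that per-region 2D motion magnitudes have already been measured between consecutive frames (three frames yield two measurements per region); the function names and the 1.2 ratio threshold are hypothetical and are not taken from the cited references.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def central_motion_ratio(center_motion, outer_motion):
    """Ratio of mean 2D motion in central face parts to that in outer parts."""
    outer = mean(outer_motion)
    center = mean(center_motion)
    if outer == 0:
        return float("inf") if center > 0 else 1.0
    return center / outer


def looks_three_dimensional(center_motion, outer_motion, min_ratio=1.2):
    """A flat photo moves rigidly (ratio near 1); a real face should exceed it."""
    return central_motion_ratio(center_motion, outer_motion) >= min_ratio
```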

The research in Reference 21 is part of the SART-2 project, a Biometric security system for mobile workstations.

Motion Pictures Instead of Still Images

In Reference 23, a blinking-based liveness detection method is described. This method seems to be fairly robust against simple photo based spoofing attacks. In addition to recognizing the face, the method locates the eyes and checks whether closing the eyes is visible in the observed image sequence. As seen from the Android 4.1 large scale trial, this method is obviously not very robust against “photoshop” attacks. See Reference 5.

In general, in order to spoof such motion picture based systems the attacker must generate a small image sequence and must present the sequence to the sensor. In a world with powerful image editors, free video editors, and tablet PCs this is relatively easy to achieve.

Such methods are characterized as “publicly known interactions,” i.e., the attacker knows the required interactions in advance and can prepare a matching image sequence.

In Reference 23, the context of the scene and eye-blink is included in the analysis. Performance measured on an Intel Core2 Duo 2.8 GHz with 2 GB RAM is approximately 50 ms per video frame (20 fps).

Challenge Response Methods

In the context of biometrics, a challenge response is defined as:

-   A method used to confirm the presence of a person by eliciting direct responses from the individual. Responses can be either voluntary or involuntary. In a voluntary response, the end user will consciously react to something that the system presents. In an involuntary response, the end user's body automatically responds to a stimulus. A challenge response can be used to protect the system against attacks.
-   (National Science & Technology Council's Subcommittee on Biometrics)

Multimodal Systems

Multimodal systems have been proposed to improve the robustness of biometric methods against spoofing attacks, noisy data etc. See Reference 25.

The effect of simulated spoofing attacks on such multimodal systems is analyzed in Reference 26. The main result is that not all fusion schemes improve the robustness against spoofing attacks, meaning that in some fusion schemes it is sufficient to spoof only a single biometric method in order to spoof the entire multimodal system. The analysis of existing schemes with real spoofing attacks led to similar results. See Reference 27.

In general, there are three different classes of multimodal systems:

-   1) Systems where successfully spoofing a single trait is sufficient to spoof the entire system. Optimizing a multimodal system for small FRRs typically leads to such results.
-   2) Systems where:
    -   a) more than one trait has to be spoofed in order to successfully spoof the entire system; and
    -   b) spoofing any one trait in this multimodal system is no more complex than spoofing the same trait in a single modal system.
-   3) Systems where:
    -   a) more than one trait has to be spoofed in order to successfully spoof the entire system; and
    -   b) spoofing any one trait in this multimodal system is more complex than spoofing the same trait in a single modal system.

    The embodiments of the invention described below fall into this category.
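The difference between the first class and the other two can be illustrated with two simple fusion rules. This is a sketch only: the rule names, the trait names, and the score values are assumptions chosen for the example, and real fusion schemes are typically more elaborate.

```python
def or_fusion(scores, threshold):
    """Accept if ANY trait passes: spoofing one trait spoofs the system (class 1)."""
    return any(score >= threshold for score in scores.values())


def and_fusion(scores, threshold):
    """Accept only if ALL traits pass: more than one trait must be spoofed."""
    return all(score >= threshold for score in scores.values())


# Hypothetical attack: a photo fools face recognition but not eye tracking.
spoofed = {"face": 0.95, "eye_tracking": 0.10}
```

With these values, `or_fusion(spoofed, 0.8)` accepts the attacker while `and_fusion(spoofed, 0.8)` rejects, which is why an "accept if any trait passes" rule tuned for low false rejection falls into the first class.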

System and Method for Eye Tracking During Authentication

One embodiment of the invention performs eye-tracking as part of an authentication process to measure the response to varying regions of interest randomly arranged and displayed on the screen. For example, a sequence of random screen layouts mixing text, empty regions, images and video clips may be presented to the user to non-intrusively induce the user's eye movement. Concurrently, eye-tracking techniques are used to verify that the eyes are reacting to the screen layout in an expected manner. This information may then be combined with face recognition techniques to verify that the expected face is still present. Moreover, as discussed above, the eye tracking and facial recognition techniques may be combined with other techniques (e.g., location-based authentication, non-intrusive user presence detection, fingerprint scanning, etc) to arrive at a sufficient level of assurance that the legitimate user is in possession of the device.

Reading a Web page or other content type does not involve a smooth sweeping of the eyes along the contents, but a series of short stops (called “fixations”) and quick “saccades”. The resulting series of fixations and saccades is called a “scanpath”. Scanpaths are useful for analyzing cognitive intent, interest, and salience (see current Wikipedia article for “Eye Tracking” at en.wikipedia.org/wiki/Eye_tracking). A “heatmap” is an aggregate representation showing what areas a group of people fixated when viewing a webpage or email (see Hartzell, “Crazy Egg Heatmap Shows Where People Click on Your Website” (Nov. 30, 2012), currently at www.michaelhartzell.com/Blog/bid/92970/Crazy-Egg-Heatmap-shows-where-people-click-on-your-website).
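One common way to separate fixations from saccades is a velocity threshold on consecutive gaze samples. The following sketch assumes gaze samples are available as (time in seconds, x in pixels, y in pixels) tuples with strictly increasing times; the 100 pixels-per-second threshold and the function names are assumptions of this example, not values from the specification.

```python
import math


def classify_samples(samples, velocity_threshold=100.0):
    """Label each movement between consecutive samples as fixation or saccade."""
    labels = []
    for (t0, x0, y0), (t1, x1, y1) in zip(samples, samples[1:]):
        speed = math.hypot(x1 - x0, y1 - y0) / (t1 - t0)  # pixels per second
        labels.append("fixation" if speed < velocity_threshold else "saccade")
    return labels


def scanpath(samples, velocity_threshold=100.0):
    """Collapse the labels into runs, e.g. [("fixation", 2), ("saccade", 1)]."""
    path = []
    for label in classify_samples(samples, velocity_threshold):
        if path and path[-1][0] == label:
            path[-1] = (label, path[-1][1] + 1)
        else:
            path.append((label, 1))
    return path
```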

As illustrated in FIG. 2, one embodiment of the invention comprises an authentication engine 210 on a client device 200 which includes a facial recognition module 204 for performing facial recognition and an eye tracking module 205 for performing the eye tracking operations described herein. In one embodiment, the facial recognition module 204 and eye tracking module 205 analyze sequences of video images 203 captured by a camera 202 on the device to perform their respective operations.

To perform its facial recognition operations, the facial recognition module 204 relies on facial recognition templates stored within a secure facial recognition database 246. In particular, as discussed above, matching logic within the facial recognition module 204 compares facial features extracted from the video images 203 with facial template data stored in the facial recognition database 246 and generates a “score” based on the similarity between the extracted features and the facial template data. As previously discussed, the facial template data stored in the database 246 may be generated by an enrollment process in which the user enrolls a facial image or other biometric data with the device 200. The score generated by the facial recognition module 204 may then be combined with scores from other authentication modules (e.g., eye tracking module 205 discussed below) to form an assurance level 206, representing the assurance that the legitimate user is initiating the current transaction. In one embodiment, each score must reach a particular threshold value to generate a sufficient assurance level 206 for a particular transaction. In one embodiment (assuming the thresholds are reached), the scores may be added together or combined using other mathematical formulae (e.g., the scores may be weighted, averaged, added together, or combined in any other way).
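One possible combination of per-module scores into an assurance level is sketched below. It applies a per-score threshold first and then a weighted average; returning 0.0 when any threshold is missed, the dictionary keys, and the weights are assumptions of this example, since the text leaves the exact formula open.

```python
def assurance_level(scores, thresholds, weights):
    """Weighted average of the scores, or 0.0 if any score misses its threshold.

    All three arguments are dicts keyed by module name (hypothetical names).
    """
    for name, score in scores.items():
        if score < thresholds[name]:
            return 0.0
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight
```

For example, with scores of 0.9 (face) and 0.8 (eye tracking), thresholds of 0.7 and 0.6, and weights of 2 and 1, the result is (1.8 + 0.8) / 3, or roughly 0.87.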

To perform eye tracking analysis, the eye tracking module 205 relies on eye tracking templates stored within a secure eye tracking database 245. Although illustrated as a separate database, the eye tracking database and facial recognition database may actually be the same secure database. In one embodiment, an eye tracking template specifies the text, graphics, pictures, videos and/or blank regions which are to be displayed for the user on the client device's display 201 (some examples of which are shown in FIGS. 4A-B below) and potentially the order in which the content is to be displayed. In addition, the eye tracking template includes data specifying the expected motion characteristic of a user's eyes in response to the content displayed to the user (e.g., in the form of a heatmap, see below). Matching logic within the eye tracking module 205 compares the expected motion of the user's eyes with the actual motion (captured from the video images) to arrive at a “score” based on the similarity between the expected motion and the actual motion. As mentioned, the score may then be combined with scores from other authentication modules (e.g., facial recognition module 204) to form an assurance level 206. The eye tracking template data stored in the database 245 may be compiled using recorded eye movements of other users and/or of the actual user of the device in response to each displayed Web page or other displayed image. For example, as with the facial recognition template, the eye tracking template may be generated as part of an enrollment process in which the user enrolls his/her eye motion with the device 200.

In one embodiment, the eye tracking module 205 determines the correlation between the images being displayed (which may include text, graphics, video, pictures, and/or blank regions) and the user's eye movement. For example, if a motion video is displayed in the lower right corner of the display, the vast majority of users will direct their attention to this region. Thus, if the eye tracking module 205 detects that the user's eyes have moved to this region within a designated period of time (e.g., 2 seconds), then it will detect a high correlation between the user's eyes and the template, resulting in a relatively high score. In contrast, if the user's eyes do not move to this region (or do not move at all), then the eye tracking module 205 will detect a low correlation and corresponding low score.
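The lower-right-corner example might be scored as in the following sketch. It assumes gaze samples of the form (seconds since the content appeared, x, y) and a rectangular region given as (left, top, right, bottom) in pixels; the scoring formula, which rewards faster arrival, is an assumption made for illustration.

```python
def time_to_region(samples, region):
    """Time of the first gaze sample inside the region, or None if never reached."""
    left, top, right, bottom = region
    for t, x, y in samples:
        if left <= x <= right and top <= y <= bottom:
            return t
    return None


def region_score(samples, region, deadline_s=2.0):
    """Score in [0.5, 1.0] if the region is reached by the deadline, else 0.0."""
    t = time_to_region(samples, region)
    if t is None or t > deadline_s:
        return 0.0
    return 1.0 - 0.5 * (t / deadline_s)
```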

As illustrated in FIG. 2, various other explicit user authentication devices 220-221 and sensors 243 may be configured on the client device 200. These authentication devices and sensors may provide additional authentication data (if necessary) to be used by the authentication engine 210 when generating the assurance level 206 (i.e., in addition to the eye tracking and facial recognition described herein). For example, the sensors may include location sensors (e.g., GPS) to determine the location of the client device 200. If the client device is in an expected location, then the authentication engine may use this data to increase the assurance level 206. By contrast, if the client device is in an unusual location (e.g., another country), then this may negatively impact the assurance level 206. In this manner, authentication data may be generated non-intrusively (i.e., using sensor data collected without explicit input from the end user).

In addition, another non-intrusive technique involves the authentication engine 210 monitoring the time which has passed since the last explicit user authentication. For example, if the user has authenticated using a fingerprint or other biometric device 220-221 or has entered a password recently (e.g., within 10 minutes), then it will use this information to increase the assurance level 206. By contrast, if the user has not explicitly authenticated for several days, then it may require more rigorous authentication by the facial recognition module 204 and eye tracking module 205 (e.g., it may require a higher correlation with the template data than usual to increase the assurance level to an acceptable value for the current transaction).
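The two non-intrusive signals discussed above (device location and time since the last explicit authentication) could influence the assurance level roughly as sketched here. Every numeric adjustment, the three-day reading of "several days," and the function names are assumptions for illustration only.

```python
RECENT_AUTH_S = 10 * 60        # "recently (e.g., within 10 minutes)" from the text
STALE_AUTH_S = 3 * 24 * 3600   # assumed reading of "several days"


def adjust_assurance(base, in_expected_location, seconds_since_explicit_auth):
    """Raise or lower a base assurance level (0..1) using contextual data."""
    level = base + (0.1 if in_expected_location else -0.2)
    if seconds_since_explicit_auth <= RECENT_AUTH_S:
        level += 0.1
    return max(0.0, min(1.0, level))


def required_correlation(seconds_since_explicit_auth, normal=0.7, strict=0.9):
    """Demand a higher template correlation when explicit authentication is stale."""
    return strict if seconds_since_explicit_auth >= STALE_AUTH_S else normal
```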

In one embodiment, secure storage 225 is a secure storage device provided for storing the authentication keys associated with each of the authenticators and used by the secure communication module 213 to establish secure communication with the relying party (e.g., a cloud service 250 or other type of network service).

An exemplary “heatmap” generated for a Web page is illustrated in FIG. 3. The color coding represents the regions of the Web page on which users fixed their eyes while viewing. Red indicates the highest amount of fixation (meaning that users tended to view these regions more frequently), followed by yellow (indicating less fixation), blue (indicating still less fixation), and then no color (indicating no fixation or fixation below a threshold amount).
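A heatmap of this kind can be approximated by counting fixations per grid cell and mapping the counts to colors. The grid resolution, the color cut-offs (0.66 and 0.33 of the maximum count), and the function names in this sketch are assumptions rather than details of FIG. 3.

```python
def build_heatmap(fixations, width, height, cols, rows):
    """Count fixation points (x, y) per cell of a rows x cols grid."""
    grid = [[0] * cols for _ in range(rows)]
    for x, y in fixations:
        col = min(int(x * cols / width), cols - 1)
        row = min(int(y * rows / height), rows - 1)
        grid[row][col] += 1
    return grid


def color_for(count, max_count):
    """Map a cell count to red, yellow, blue, or None (no color)."""
    if max_count == 0 or count == 0:
        return None
    share = count / max_count
    if share >= 0.66:
        return "red"
    if share >= 0.33:
        return "yellow"
    return "blue"
```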

When designing web pages, eye tracking and heatmap analysis are performed as part of the usability analysis. Research (see, e.g., References 29, 30) has shown that Web users spend 80% of their time looking at information above the page fold. Although users do scroll, they allocate only 20% of their attention below the fold. Web users spend 69% of their time viewing the left half of the page and 30% viewing the right half. A conventional layout is thus more likely to make sites profitable.

Spoofing attacks such as presenting a still face image or a video displayed on a monitor can be detected by the eye tracking module 205, as the scanpath would most probably not correlate to the screen layout. Different types of eye-tracking methods are available: specialized equipment with high accuracy and software based methods using standard web cams (see Reference 33).

FIG. 4A illustrates an exemplary grouping of text 405 and an image and/or video 401 displayed on the client device display 201. In one embodiment, the grouping is integrated into a Web page. However, the underlying principles of the invention are not limited to a Web-based organization. The grouping could also be part of a screen saver or other applications. In one embodiment, the text 405 and image/video 401 are displayed concurrently. In another embodiment, the text is displayed first, followed by the image/video 401. In either case, the expectation is that the user's eyes would be directed to the lower right corner of the display 201 (where the image/video 401 is displayed).

FIG. 4B illustrates another example which includes a text region 405 and three image/video elements 400-402. In one embodiment, the image/video element 400 is displayed first, followed by image/video element 401, followed by image/video element 402. In such a case, the user's eyes should move from the upper right corner of the display, to the lower right, and then to the lower left.
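Checking that the gaze visits the regions in the displayed order might look like the following sketch. The (time, x, y) sample format, the rectangular (left, top, right, bottom) regions, and the function names are assumptions of this example.

```python
def inside(x, y, region):
    """True if the point lies within the (left, top, right, bottom) rectangle."""
    left, top, right, bottom = region
    return left <= x <= right and top <= y <= bottom


def visited_in_order(samples, regions):
    """True if the gaze enters each region in the given order."""
    next_region = 0
    for _t, x, y in samples:
        if next_region < len(regions) and inside(x, y, regions[next_region]):
            next_region += 1
    return next_region == len(regions)
```

For a 1000×600 display, the three regions of the FIG. 4B example could be given as upper right (500, 0, 1000, 300), lower right (500, 300, 1000, 600), and lower left (0, 300, 500, 600).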

In one embodiment, the particular image/video elements 400-402 and other content types are randomly selected by the eye tracking module 205, thereby making the sequence harder to anticipate and spoof. In addition, the particular locations in which the different image/video elements 400-402 are displayed may be selected randomly. In such a case, the eye motion template may specify a particular mode of operation for displaying content, but will not specify the actual content or the actual location(s). Rather, the content and the locations are selected by the eye tracking module 205, which will then assume that the user's eyes should gravitate towards the content being displayed and generate a correlation and score based on the extent to which this is detected.
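Random selection of content and locations may be sketched as follows. The position and content labels are hypothetical, and the use of an operating-system randomness source (so that an attacker cannot predict the layout from earlier ones) is a design assumption of this example rather than a requirement stated in the text.

```python
import secrets

POSITIONS = ["upper_left", "upper_right", "lower_left", "lower_right"]
ATTENTION_CONTENT = ["image", "video"]  # content expected to attract the eyes


def random_layout(num_elements=3):
    """Pick distinct screen positions and a content type for each, unpredictably."""
    rng = secrets.SystemRandom()
    positions = rng.sample(POSITIONS, num_elements)
    return [(rng.choice(ATTENTION_CONTENT), position) for position in positions]
```

The returned list doubles as the expected scanpath: the eyes should move to each position in the order in which its content is displayed.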

In addition, rather than generating its own content, the eye tracking module 205 may use existing content such as an existing Web page of the relying party 250 or images stored locally on the device. For example, if the relying party is a financial institution and the user is attempting to enter into a financial transaction, then the Web page normally displayed during the transaction may be displayed. In such a case, the eye tracking module 205 may retrieve a heatmap for the Web page (such as shown in FIG. 3) from the eye tracking database 245 and determine whether a correlation exists between the heatmap and the locations being viewed by the end user.
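One way to quantify such a correlation is the Pearson coefficient between the stored heatmap and a grid of the user's observed fixation counts. The choice of Pearson correlation and the grid representation (equal-sized lists of rows) are assumptions of this sketch; the specification does not name a specific measure.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def heatmap_correlation(expected, observed):
    """Correlate two grids of fixation counts cell by cell."""
    flat_expected = [value for row in expected for value in row]
    flat_observed = [value for row in observed for value in row]
    return pearson(flat_expected, flat_observed)
```

A live user whose fixations concentrate where the stored heatmap is hottest yields a coefficient near 1, whereas a still photo staring at one unrelated spot yields a low or negative value.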

In summary, the embodiments described herein may present a sequence of random screen layouts mixing text, empty regions, images and video clips and continuously track the user's eyes, producing the captured scanpath. A correlation is then made between the captured scanpath and the expected scanpath. In addition, one embodiment of the invention may then re-verify that the face is still recognized.

Not all people are equally attracted by the same images or image sequences. For example, some people are attracted by technology more than they are by animals, text, known or unknown human faces or bodies, mystic symbols, or even mathematical formulas. With this in mind, one embodiment of the eye tracking module 205 learns the person-specific characteristics of eye movement triggered by different types of images. The degree of similarity between the characteristic measured from the video images 203 and the reference data (stored in the eye tracking database 245) is then used to generate the assurance level 206 (i.e., the certainty that the legitimate user's eyes are following “challenge” images, video, and other content displayed on the display 201).

A method in accordance with one embodiment of the invention is illustrated in FIG. 5. The method may be implemented within a system architecture such as shown in FIG. 2, but is not limited to any particular system architecture.

At 501, a particular eye tracking template is selected for the given user and/or transaction and, at 502, a sequence of images of the user's face is captured while displaying content according to the template. For example, the template may specify the types of content, the location of the content, and the timing for displaying the content. Alternatively, the template may only generally specify a type of eye-tracking and the eye tracking module 205 may determine how, where and when to display the content.
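
The two kinds of template described above might be represented as in the following Python sketch. The field names, the `mode` strings, and the `expected_region_at` helper are assumptions chosen for illustration and do not come from the specification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ContentStep:
    content_type: str   # e.g. "image", "video", "text"
    region: str         # where on the display the content appears
    start_ms: int       # when the content appears
    duration_ms: int    # how long it stays visible


@dataclass(frozen=True)
class EyeTrackingTemplate:
    mode: str                             # general type of eye tracking
    steps: tuple[ContentStep, ...] = ()   # empty: the module decides how, where, when

    def is_fully_specified(self) -> bool:
        return len(self.steps) > 0


def expected_region_at(template: EyeTrackingTemplate, t_ms: int) -> Optional[str]:
    """Return the region the user is expected to look at, at time t_ms."""
    for step in template.steps:
        if step.start_ms <= t_ms < step.start_ms + step.duration_ms:
            return step.region
    return None


guided = EyeTrackingTemplate(
    mode="guided",
    steps=(
        ContentStep("image", "top_left", 0, 1000),
        ContentStep("video", "center", 1000, 2000),
    ),
)
```

A template with an empty `steps` tuple corresponds to the alternative in which only the type of eye tracking is given and the module chooses the content, locations and timing itself.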

Regardless of how the content is selected and displayed, at 503, facial recognition is performed and, at 504, eye tracking analysis is performed using the captured sequence of images. At 505, a facial assurance level is generated based on the correlation between the captured images and the facial templates. Similarly, at 506, an eye tracking assurance level is generated based on the correlation between the motion of the user's eyes and the expected motion of the user's eyes.

Although illustrated in FIG. 5 as parallel operations 503/505 and 504/506, the facial recognition operations 503/505 may be performed first and the eye tracking operations 504/506 may then be performed only if the facial recognition operations result in a high correlation/assurance level (or vice-versa).

At 507, a determination is made as to whether the combined results of the facial authentication and eye tracking are sufficient to allow the current transaction to proceed. If so, then the transaction is permitted at 509. If not, then at 508, the transaction is disallowed or additional authentication techniques are requested to raise the level of assurance. For example, at this stage, the user may be asked to swipe a finger on a fingerprint sensor or to enter a PIN associated with the user's account. If the additional authentication techniques are sufficient, as determined at 510, then the transaction is permitted at 509.
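
The decision flow of operations 507-510 can be sketched in Python as follows. The specification does not state how the two assurance levels are combined, so taking the weaker of the two is an assumption made only for this example, as are the function names and the 0-to-1 scale.

```python
from typing import Callable, Optional


def combine(facial_level: float, eye_level: float) -> float:
    """Hypothetical fusion rule: the weaker of the two assurance levels."""
    return min(facial_level, eye_level)


def authenticate(
    facial_level: float,
    eye_level: float,
    required_level: float,
    step_up: Optional[Callable[[], bool]] = None,
) -> bool:
    """Return True if the transaction is permitted, False if disallowed."""
    # 507: is the combined result sufficient for this transaction?
    if combine(facial_level, eye_level) >= required_level:
        return True  # 509: transaction permitted
    # 508: request additional authentication (e.g. fingerprint or PIN), if any.
    # 510: permit only if the additional technique succeeds.
    if step_up is not None and step_up():
        return True  # 509
    return False     # transaction disallowed


permitted = authenticate(0.9, 0.4, 0.7, step_up=lambda: True)
```

Here `step_up` stands in for whatever additional technique is requested at 508; passing no callback models the case in which the transaction is simply disallowed.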

Exemplary System Architectures

FIGS. 6A-B illustrate two embodiments of a system architecture comprising client-side and server-side components for authenticating a user. The embodiment shown in FIG. 6A uses a browser plugin-based architecture for communicating with a website while the embodiment shown in FIG. 6B does not require a browser. The various techniques described herein for eye-tracking authentication and facial recognition authentication may be implemented on either of these system architectures. For example, the authentication engine 210 shown in FIG. 2 may be implemented as part of the secure transaction service 601 (including interface 602) and/or the secure transaction plugin 605 or application 652. It should be noted, however, that the embodiment illustrated in FIG. 2 stands on its own and may be implemented using logical arrangements of hardware and software other than those shown in FIGS. 6A-B.

While the secure storage 620 is illustrated outside of the secure perimeter of the authentication device(s) 610-612, in one embodiment, each authentication device 610-612 may have its own integrated secure storage. Alternatively, each authentication device 610-612 may cryptographically protect the biometric reference data records (e.g., wrapping them using a symmetric key to make the storage 620 secure).

Turning to FIG. 6A, the illustrated embodiment includes a client 600 equipped with one or more authentication devices 610-612 for enrolling and authenticating an end user. As mentioned above, the authentication devices 610-612 may include biometric devices such as fingerprint sensors, voice recognition hardware/software (e.g., a microphone and associated software for recognizing a speaker), facial recognition hardware/software (e.g., a camera and associated software for recognizing a user's face), and optical recognition capabilities (e.g., an optical scanner and associated software for scanning the retina of a user), and non-biometric devices such as trusted platform modules (TPMs) and smartcards.

The authentication devices 610-612 are communicatively coupled to the client through an interface 602 (e.g., an application programming interface or API) exposed by a secure transaction service 601. The secure transaction service 601 is a secure application for communicating with one or more secure transaction servers 632-633 over a network and for interfacing with a secure transaction plugin 605 executed within the context of a web browser 604. As illustrated, the interface 602 may also provide secure access to a secure storage device 620 on the client 600 which stores information related to each of the authentication devices 610-612 such as a device identification code, user identification code, user enrollment data (e.g., scanned fingerprint or other biometric data), and keys used to perform the secure authentication techniques described herein. For example, as discussed in detail below, a unique key may be stored into each of the authentication devices and used when communicating with servers 630 over a network such as the Internet.

As discussed below, certain types of network transactions are supported by the secure transaction plugin 605 such as HTTP or HTTPS transactions with websites 631 or other servers. In one embodiment, the secure transaction plugin is initiated in response to specific HTML tags inserted into the HTML code of a web page by the web server 631 within the secure enterprise or Web destination 630 (sometimes simply referred to below as “server 630”). In response to detecting such a tag, the secure transaction plugin 605 may forward transactions to the secure transaction service 601 for processing. In addition, for certain types of transactions (e.g., secure key exchange) the secure transaction service 601 may open a direct communication channel with the on-premises transaction server 632 (i.e., co-located with the website) or with an off-premises transaction server 633.

The secure transaction servers 632-633 are coupled to a secure transaction database 640 for storing user data, authentication device data, keys and other secure information needed to support the secure authentication transactions described below. It should be noted, however, that the underlying principles of the invention do not require the separation of logical components within the secure enterprise or web destination 630 shown in FIG. 6A. For example, the website 631 and the secure transaction servers 632-633 may be implemented within a single physical server or separate physical servers. Moreover, the website 631 and transaction servers 632-633 may be implemented within an integrated software module executed on one or more servers for performing the functions described below.

As mentioned above, the underlying principles of the invention are not limited to a browser-based architecture shown in FIG. 6A. FIG. 6B illustrates an alternate implementation in which a stand-alone application 654 utilizes the functionality provided by the secure transaction service 601 to authenticate a user over a network. In one embodiment, the application 654 is designed to establish communication sessions with one or more network services 651 which rely on the secure transaction servers 632-633 for performing the user/client authentication techniques described in detail below.

In either of the embodiments shown in FIGS. 6A-B, the secure transaction servers 632-633 may generate the keys which are then securely transmitted to the secure transaction service 601 and stored into the authentication devices within the secure storage 620. Alternatively, the secure transaction service 601 might generate the keys which are then securely transmitted to the transaction servers 632-633. Additionally, the secure transaction servers 632-633 manage the secure transaction database 640 on the server side.

Embodiments of the invention may include various steps as set forth above. The steps may be embodied in machine-executable instructions which cause a general-purpose or special-purpose processor to perform certain steps. Alternatively, these steps may be performed by specific hardware components that contain hardwired logic for performing the steps, or by any combination of programmed computer components and custom hardware components.

Elements of the present invention may also be provided as a machine-readable medium for storing the machine-executable program code. The machine-readable medium may include, but is not limited to, floppy diskettes, optical disks, CD-ROMs, magneto-optical disks, ROMs, RAMs, EPROMs, EEPROMs, magnetic or optical cards, or other types of media/machine-readable medium suitable for storing electronic program code.

Throughout the foregoing description, for the purposes of explanation, numerous specific details were set forth in order to provide a thorough understanding of the invention. It will be apparent, however, to one skilled in the art that the invention may be practiced without some of these specific details. For example, it will be readily apparent to those of skill in the art that the functional modules and methods described herein may be implemented as software, hardware or any combination thereof. Moreover, although some embodiments of the invention are described herein within the context of a mobile computing environment, the underlying principles of the invention are not limited to a mobile computing implementation. Virtually any type of client or peer data processing devices may be used in some embodiments including, for example, desktop or workstation computers. Accordingly, the scope and spirit of the invention should be judged in terms of the claims which follow.

REFERENCES

1. Biometrics: Uses and Abuses. Schneier, B. 1999. Inside Risks 110 (CACM 42, 8, August 1999). http://www.schneier.com/essay-019.pdf.

2. Zhao, W., et al. Face Recognition: A Literature Survey. ACM Computing Surveys, Vol. 35, No. 4. December 2003, pp. 399-458.

3. Andrea F. Abate, Michele Nappi, Daniel Riccio, Gabriele Sabatino. 2D and 3D face recognition: A survey. Pattern Recognition Letters. 2007, 28, pp. 1885-1906.

4. GSM Arena. GSM Arena. [Online] Nov. 13, 2011. [Cited: Sep. 29, 2012.] http://www.gsmarena.com/ice_cream_sandwichs_face_unlock_duped_using_a_photograph-news-3377.php.

5. James. Print Screen Mac. [Online] Aug. 6, 2012. [Cited: Sep. 28, 2012.] http://printscreenmac.info/how-to-trick-android-jelly-bean-face-unlock/.

6. P. JONATHON PHILLIPS, PATRICK GROTHER, ROSS J. MICHEALS, DUANE M. BLACKBURN, ELHAM TABASSI, MIKE BONE. FACE RECOGNITION VENDOR TEST 2002: Evaluation Report. s.l.: NIST, 2002. http://www.face-rec.org/vendors/FRVT_2002_Evaluation_Report.pdf.

7. P. Jonathon Phillips, W. Todd Scruggs, Alice J. O'Toole, Patrick J. Flynn, Kevin W. Bowyer, Cathy L. Schott, Matthew Sharpe. FRVT 2006 and ICE 2006 Large-Scale Results, NIST IR 7408. Gaithersburg: NIST, 2006.

8. Patrick J. Grother, George W. Quinn and P. Jonathon Phillips, NIST. Report on the Evaluation of 2D Still-Image Face Recognition Algorithms, NIST IR 7709. s.l.: NIST, 2011.

9. Murali Mohan Chakka, André Anjos, Sébastien Marcel, Roberto Tronci, Daniele Muntoni, Gianluca Fadda, Maurizio Pili, Nicola Sirena, Gabriele Murgia, Marco Ristori, Fabio Roli, Junjie Yan, Dong Yi, Zhen Lei, Zhiwei Zhang, Stan Z. Li, et al. Competition on Counter Measures to 2-D Facial Spoofing Attacks. 2011. http://www.csis.pace.edu/~ctappert/dps/IJCB2011/papers/130.pdf. 978-1-4577-1359-0/11

10. Nalini K. Ratha, Jonathan H. Connell, and Ruud M. Bolle, IBM Thomas J. Watson Research Center. An Analysis of Minutiae Matching Strength. Hawthorne, N.Y. 10532: IBM. http://pdf.aminer.org/000/060/741/an_analysis_of_minutiae_matching_strength.pdf.

11. Roberts, Chris. Biometric Attack Vectors and Defences. 2006. http://otago.ourarchive.ac.nz/bitstream/handle/10523/1243/BiometricAttackVectors.pdf.

12. Video-Based Face Spoofing Detection through Visual Rhythm Analysis. Allan da Silva Pinto, Helio Pedrini, William Robson Schwartz, Anderson Rocha. Los Alamitos: IEEE Computer Society Conference Publishing Services, 2012. Conference on Graphics, Patterns and Images, 25. (SIBGRAPI). http://sibgrapi.sid.inpe.br/rep/sid.inpe.br/sibgrapi/2012/07.13.21.16?mirror=sid.inpe.br/banon/2001/03.30.15.38.24&metadatarepository=sid.inpe.br/sibgrapi/2012/07.13.21.16.53.

13. Jiangwei Li, Yunhong Wang, Tieniu Tan, A. K. Jain. Live Face Detection Based on the Analysis of Fourier Spectra. Biometric Technology for Human Identification. 2004, pp. 296-303.

14. Xiaoyang Tan, Yi Li, Jun Liu and Lin Jiang. Face Liveness Detection from A Single Image with Sparse Low Rank Bilinear Discriminative Model. s.l.: European Conference on Computer Vision, 2010. pp. 504-517.

15. Jukka Määttä, Abdenour Hadid, Matti Pietikäinen, Machine Vision Group, University of Oulu, Finland. Face Spoofing Detection From Single Images Using Micro-Texture Analysis. Oulu, Finland: IEEE, 2011. http://www.ee.oulu.fi/research/mvmp/mvg/files/pdf/131.pdf.

16. R. Tronci, D. Muntoni, G. Fadda, M. Pili, N. Sirena, G. Murgia, M. Ristori, and F. Roli. Fusion of Multiple Clues for Photo-Attack Detection in Face Recognition Systems. s.l.: Intl. Joint Conference on Biometrics, 2011. pp. 1-6.

17. Marko Heikkilä and Matti Pietikäinen. A Texture-Based Method for Modeling the Background and Detecting Moving Objects. Oulu: IEEE, 2005. http://www.ee.oulu.fi/mvg/files/pdf/pdf_662.pdf.

18. Yigang Peng, Arvind Ganesh, John Wright and Yi Ma. RASL: Robust Alignment by Sparse and Low-rank Decomposition for Linearly Correlated Images. IEEE Conference on Computer Vision and Pattern Recognition. 2010, pp. 763-770. http://yima.csl.illinois.edu/psfile/RASL_CVPR10.pdf.

19. S. Kong, J. Heo, B. Abidi, J. Paik, and M. Abidi. Recent advances in visual and infrared face recognition—a review. Journal of Computer Vision and Image Understanding. June 2005, Vol. 1, 97, pp. 103-135.

20. K. Kollreider, H. Fronthaler and J. Bigun, Halmstad University, SE-30118, Sweden. Evaluating Liveness by Face Images and the Structure Tensor. Halmstad, Sweden: s.n., 2005. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.62.6534&rep=rep1&type=pdf.

21. Maciej Smiatacz, Gdansk University of Technology. LIVENESS MEASUREMENTS USING OPTICAL FLOW FOR BIOMETRIC PERSON AUTHENTICATION. Metrology and Measurement Systems. 2012, Vol. XIX, 2.

22. Bao, Wei, et al. A liveness detection method for face recognition based on optical flow field. Image Analysis and Signal Processing, IASP 2009. Apr. 11-12, 2009, pp. 233-236. http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5054589&isnumber=5054562.

23. Gang Pan, Zhaohui Wu and Lin Sun. Liveness Detection for Face Recognition. [book auth.] Kresimir Delac, Mislav Grgic and Marian Stewart Bartlett. Recent Advances in Face Recognition. Vienna: I-Tech, 2008, p. 236 ff.

24. National Science & Technology Council's Subcommittee on Biometrics. Biometrics Glossary. NSTC. http://www.biometrics.gov/documents/glossary.pdf.

25. Arun Ross and Anil K. Jain. Multimodal Biometrics: An Overview. Proceedings of 12th European Signal Processing Conference (EUSIPCO). September 2004, pp. 1221-1224. http://www.csee.wvu.edu/~ross/pubs/RossMultimodalOverview_EUSIPCO04.pdf.

26. R. N. Rodrigues, et al. Robustness of multimodal biometric fusion methods against spoof attacks. Journal of Visual Language and Computing. 2009. http://cubs.buffalo.edu/govind/papers/visual09.pdf.

27. Spoof Attacks on Multimodal Biometric Systems. Zahid Akhtar, Sandeep Kale, Nasir Alfarid. Singapore: IACSIT Press, Singapore, 2011. 2011 International Conference on Information and Network Technology IPCSIT. Vol. 4. http://www.ipcsit.com/vol4/9-ICINT2011T046.pdf.

28. EyeTools. Part III: What is a heatmap . . . really? [Online] [Cited: Nov. 1, 2012.] http://eyetools.com/articles/p3-understanding-eye-tracking-what-is-a-heatmap-really.

29. Nielsen, Jakob. useit.com. Jakob Nielsen's Alertbox—Scrolling and Attention. [Online] Mar. 22, 2010. [Cited: Nov. 1, 2012.] http://www.useit.com/alertbox/scrolling-attention.html.

30. Nielsen, Jakob. useit.com. Jakob Nielsen's Alertbox—Horizontal Attention Leans Left. [Online] Apr. 6, 2010. [Cited: Nov. 1, 2012.] http://www.useit.com/alertbox/horizontal-attention.html.

31. Gus Lubin, Kim Bhasin and Shlomo Sprung. Business Insider. 16 Heatmaps That Reveal Exactly Where People Look. [Online] May 21, 2012. [Cited: Nov. 1, 2012.] http://www.businessinsider.com/eye-tracking-heatmaps-2012-5?op=1.

32. Lin-Shung Huang, Alex Moshchuk, Helen J. Wang, Stuart Schechter, Collin Jackson. Clickjacking: Attacks and Defenses. s.l.: Usenix Security 2012, 2012. https://www.usenix.org/system/files/conference/usenixsecurityl2/sec12-final39.pdf.

33. Willis, Nathan. Linux.com. Weekend Project: Take a Tour of Open Source Eye-Tracking Software. [Online] Mar. 2, 2012. [Cited: Nov. 1, 2012.] https://www.linux.com/learn/tutorials/550880-weekend-project-take-a-tour-of-open-source-eye-tracking-software.

34. Girija Chetty, School of ISE, University of Canberra, Australia. Multilevel liveness verification for face-voice biometric authentication. BYSM-2006 Symposium. Baltimore: s.n., Sep. 19, 2006. http://www.biometrics.org/bc2006/presentations/Tues_Sep_19/BSYM/19_Chetty_research.pdf.

35. P. A. Tresadern, C. McCool, N. Poh, P. Matejka, A. Hadid, C. Levy, T. F. Cootes and S. Marcel. Mobile Biometrics (MoBio): Joint Face and Voice Verification for a Mobile Platform. 2012. http://personal.ee.surrey.ac.uk/Personal/Norman.Poh/data/tresadern_PervComp2012_draft.pdf.

36. Rabia Jafri and Hamid R. Arabnia. A Survey of Face Recognition Techniques. Journal of Information Processing Systems. June 2009, Vol. 5, 2, pp. 41-68. http://www.cosy.sbg.ac.at/~uhl/face_recognition.pdf.

37. Himanshu, Sanjeev Dhawan, Neha Khurana. A REVIEW OF FACE RECOGNITION. International Journal of Research in Engineering & Applied Sciences. February 2012, Vol. 2, 2, pp. 835-846. http://euroasiapub.org/IJREAS/Feb2012/81.pdf.

38. BIOMETRIC IMAGE PROCESSING AND RECOGNITION. P. Jonathon Phillips, R. Michael McCabe, and Rama Chellappa. 1998. Eusipco.

39. Shaohua Kevin Zhou and Rama Chellappa. Face Recognition from Still Images and Videos. University of Maryland, College Park, Md. 20742. Maryland: s.n., 2004. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.77.1312&rep=rep1&type=pdf.

40. George W. Quinn, Patrick J. Grother, NIST. Performance of Face Recognition Algorithms on Compressed Images, NIST Inter Agency Report 7830. s.l.: NIST, 2011.

41. The Extended M2VTS Database. [Online] [Cited: Sep. 29, 2012.] http://www.ee.surrey.ac.uk/CVSSP/xm2vtsdb/.

42. N. K. Ratha, J. H. Connell, R. M. Bolle, IBM. Enhancing security and privacy in biometrics-based authentication systems. IBM Systems Journal. 2001, Vol. 40, 3.

43. Schuckers, Stephanie A. C. Spoofing and Anti-Spoofing Measures. Information Security Technical Report. 2002, Vol. 7, 4.

44. William Robson Schwartz, Anderson Rocha, Helio Pedrini. Face Spoofing Detection through Partial Least Squares and Low-Level Descriptors. s.l.: Intl. Joint Conference on Biometrics, 2011. pp. 1-8.

45. Edited by Kresimir Delac, Mislav Grgic and Marian Stewart Bartlett. s.l.: InTech, 2008. http://cdn.intechopen.com/finals/81/InTech-Recent_advances_in_face_recognition.zip. ISBN 978-953-7619-34-3.

46. Gang Pan, Lin Sun, Zhaohui Wu, Yueming Wang. Monocular camera-based face liveness detection by combining eyeblink and scene context. s.l.: Springer Science+Business Media, LLC, 2010. http://www.cs.zju.edu.cn/~gpan/publication/2011-TeleSys-liveness.pdf.

47. Roberto Tronci, Daniele Muntoni, Gianluca Fadda, Maurizio Pili, Nicola Sirena, Gabriele Murgia, Marco Ristori, Fabio Roli. Fusion of multiple clues for photo-attack detection in face recognition systems. 09010 Pula (CA), Italy: s.n., 2011. http://prag.diee.unica.it/pra/system/files/Amilab_IJCB2011.pdf.

48. Anderson Rocha, Walter Scheirer, Terrance Boult, Siome Goldenstein. Vision of the Unseen: Current Trends and Challenges in Digital Image and Video Forensics. s.l.: ACM Computing Surveys, 2010. http://www.wjscheirer.com/papers/wjs_csur2011_forensics.pdf.

49. Ernie Brickell, Intel Corporation; Jan Camenisch, IBM Research; Liqun Chen, HP Laboratories. Direct Anonymous Attestation. 2004. http://eprint.iacr.org/2004/205.pdf.

What is claimed is:
 1. A method comprising: receiving a request to authenticate a user, the request generated responsive to a transaction initiated by the user; selecting an eye tracking template based on the user and/or the transaction; capturing a sequence of images of the user's face while displaying content to the user, the content generated according to the eye tracking template; performing facial recognition on the user's face; generating a facial assurance level based on a correlation between the captured sequence of images and facial template data associated with the user; performing eye tracking analysis using the captured sequence of images; generating an eye tracking assurance level based on a correlation between motion of the user's eyes and an expected motion of the user's eyes; allowing the transaction to proceed when a combined result of the facial assurance level and the eye tracking assurance level is sufficient to allow the transaction to proceed; and disallowing the transaction and/or performing additional authentication techniques to raise the level of assurance when the combined result is not sufficient to allow the transaction to proceed.
 2. The method of claim 1, wherein the eye tracking template specifies a type of content, a location of content, and a timing for displaying content.
 3. The method of claim 1, wherein the eye tracking template specifies a type of eye-tracking and an eye tracking module determines how, where and when to display the content.
 4. The method of claim 1, wherein the eye tracking analysis is performed only if facial assurance level is above a specified threshold.
 5. The method of claim 1, wherein additional authentication techniques comprise the user swiping a finger on a fingerprint sensor and/or entering a personal identification number (PIN) associated with the user's account.
 6. The method of claim 1, wherein displaying content to the user comprises displaying one or more graphics images, photographs, or motion video images in designated regions of a display.
 7. The method of claim 1, wherein the transaction is an online transaction with between the user and a remote relying party.
 8. The method as in claim 1, wherein the expected motion of the user's eyes is based on learned characteristics of eye-movement triggered by different types of images.
 9. An apparatus comprising: an authentication engine to receive a request to authenticate a user, wherein the request is generated responsive to a transaction initiated by the user and the authentication engine is further to select an eye tracking template based on the user and/or the transaction; a camera to capture a sequence of images of the user's face while content displayed to the user, the content generated according to the eye tracking template; a facial recognition device to perform facial recognition on the user's face; an eye tracking hardware module to performing eye tracking analysis using the captured sequence of images; wherein the authentication engine is further to: generate a facial assurance level based on a correlation between the captured sequence of images and facial template data associated with the user; generate an eye tracking assurance level based on a correlation between motion of the user's eyes and an expected motion of the user's eyes; allow the transaction to proceed when a combined result of the facial assurance level and the eye tracking assurance level is sufficient to allow the transaction to proceed; and disallow the transaction and/or perform additional authentication techniques to raise the level of assurance when the combined result is not sufficient to allow the transaction to proceed.
 10. The apparatus of claim 9, wherein the eye tracking template specifies a type of content, a location of content, and a timing for displaying content.
 11. The apparatus of claim 9, wherein the eye tracking template specifies a type of eye-tracking and an eye tracking module determines how, where and when to display the content.
 12. The apparatus of claim 9, wherein the eye tracking analysis is performed only if facial assurance level is above a specified threshold.
 13. The apparatus of claim 9, wherein additional authentication techniques comprise the user swiping a finger on a fingerprint sensor and/or entering a personal identification number (PIN) associated with the user's account.
 14. The apparatus of claim 9, wherein displaying content to the user comprises displaying one or more graphics images, photographs, or motion video images in designated regions of the display.
 15. The apparatus of claim 9, wherein the transaction is an online transaction with between the user and a remote relying party.
 16. The apparatus of claim 9, wherein the expected motion of the user's eyes is based on learned characteristics of eye-movement triggered by different types of images.